Automatic Construction of a Semantic Knowledge Base from CEUR Workshop Proceedings

Authors

  • Bahar Sateli
  • René Witte
Abstract

We present an automatic workflow that performs text segmentation and entity extraction from scientific literature, primarily to address Task 2 of the Semantic Publishing Challenge 2015. The goal of Task 2 is to extract various information from full-text papers to represent the context in which a document is written, such as the affiliation of its authors and the corresponding funding bodies. Our proposed solution is composed of two subsystems: (i) a text mining pipeline, developed based on the GATE framework, which extracts structural and semantic entities, such as authors' information and references, and produces semantic (typed) annotations; and (ii) a flexible exporting module, the LODeXporter, which translates the document annotations into RDF triples according to custom mapping rules. Additionally, we leverage existing Named Entity Recognition (NER) tools to extract named entities from text and ground them to their corresponding resources on the Linked Open Data cloud, thereby briefly covering the objectives of Task 3, which involves linking detected entities to resources in existing open datasets. The output of our system is an RDF graph stored in a scalable TDB-based storage with a public SPARQL endpoint for the task's queries.

URL: http://www.springer.com/us/book/9783319255170
DOI: 10.1007/978-3-319-25518-7
Copyright: © Springer International Publishing Switzerland 2015. This is the author's version of the work. It is posted here by permission of Springer for your personal use. Not for redistribution.
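
The export step described in the abstract, where typed annotations are translated into RDF triples according to mapping rules, can be illustrated with a minimal Python sketch. This is not LODeXporter's code or its rule syntax: the Annotation record, the MAPPING_RULES table, the example.org base IRI, and the choice of FOAF classes are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical, simplified annotation record (GATE's own model is richer).
@dataclass(frozen=True)
class Annotation:
    ann_type: str   # semantic type, e.g. "Author" or "Affiliation"
    text: str       # surface text covered by the annotation
    doc_id: str     # identifier of the containing document
    ann_id: int     # identifier of the annotation within the document

BASE = "http://example.org/"  # assumed base IRI, not from the paper
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed mapping rules: annotation type -> (RDF class, property for the text).
MAPPING_RULES = {
    "Author": ("http://xmlns.com/foaf/0.1/Person",
               "http://xmlns.com/foaf/0.1/name"),
    "Affiliation": ("http://xmlns.com/foaf/0.1/Organization",
                    "http://xmlns.com/foaf/0.1/name"),
}

def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))

def annotation_to_triples(ann):
    """Return N-Triples lines for one annotation, or [] if no rule applies."""
    rule = MAPPING_RULES.get(ann.ann_type)
    if rule is None:
        return []
    rdf_class, text_property = rule
    subject = f"<{BASE}{ann.doc_id}/{ann.ann_type.lower()}/{ann.ann_id}>"
    return [
        f"{subject} <{RDF_TYPE}> <{rdf_class}> .",
        f'{subject} <{text_property}> "{escape_literal(ann.text)}" .',
    ]

def export(annotations):
    """Translate a sequence of annotations into a flat list of triples."""
    lines = []
    for ann in annotations:
        lines.extend(annotation_to_triples(ann))
    return lines
```

Keeping the rules in a plain table separate from the translation function mirrors the idea of custom mapping rules: changing the target vocabulary only requires editing the table, and annotation types without a rule are skipped.
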
Source URL (retrieved on 2017-09-06 06:53): http://www.semanticsoftware.info/biblio/automatic-construction-semantic-knowledge-base-ceur-workshopproceedings

Similar resources

Semantify CEUR-WS Proceedings: Towards the Automatic Generation of Highly Descriptive Scholarly Publishing Linked Datasets

Rich and fine-grained semantic information describing varied aspects of scientific productions is essential to support their diffusion as well as to properly assess the quality of their output. To foster this trend, in the context of the ESWC2014 Semantic Publishing Challenge, we present a system that automatically generates rich RDF datasets from CEUR-WS workshop proceedings. Proceedings are a...

Automatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach

In social networking/microblogging environments, the #tag is often used for categorizing messages and marking their key points. Also, since some social networks such as Twitter restrict the number of characters in messages, #tags can serve as a useful tool for helping users express their messages. In this paper, a new knowledge-intensive content-based #tag recommendation system is intr...

NAACL-HLT 2012 Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction

Probabilistic knowledge bases are commonly used in areas such as large-scale information extraction, data integration, and knowledge capture, to name but a few. Inference in probabilistic knowledge bases is a computationally challenging problem. With this contribution, we present our vision of a distributed inference algorithm based on conflict graph construction and hypergraph sampling. Early ...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Publication date: 2015